Mixed-rates asymptotics
A general method is presented for deriving the limiting behavior of
estimators that are defined as the values of parameters optimizing an empirical
criterion function. The asymptotic behavior of such estimators is typically
deduced from uniform limit theorems for rescaled and reparametrized criterion
functions. The new method can handle cases where the standard approach does not
yield the complete limiting behavior of the estimator. The asymptotic analysis
depends on a decomposition of criterion functions into sums of components with
different rescalings. The method is explained by examples from Lasso-type
estimation, k-means clustering, Shorth estimation and partial linear models.
Comment: Published at http://dx.doi.org/10.1214/009053607000000668 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
We propose a novel high-dimensional linear regression estimator: the Discrete
Dantzig Selector, which minimizes the number of nonzero regression coefficients
subject to a budget on the maximal absolute correlation between the features
and residuals. Motivated by the significant advances in integer optimization
over the past 10-15 years, we present a Mixed Integer Linear Optimization
(MILO) approach to obtain certifiably optimal global solutions to this
nonconvex optimization problem. The current state of algorithmics in integer
optimization makes our proposal substantially more computationally attractive
than the least squares subset selection framework based on integer quadratic
optimization, recently proposed in [8] and the continuous nonconvex quadratic
optimization framework of [33]. We propose new discrete first-order methods,
which when paired with state-of-the-art MILO solvers, lead to good solutions
for the Discrete Dantzig Selector problem for a given computational budget. We
illustrate that our integrated approach provides globally optimal solutions in
significantly shorter computation times, when compared to off-the-shelf MILO
solvers. We demonstrate both theoretically and empirically that in a wide range
of regimes the statistical properties of the Discrete Dantzig Selector are
superior to those of popular ℓ1-based approaches. We illustrate that
our approach can handle problem instances with p = 10,000 features with
certifiable optimality, making it a highly scalable combinatorial variable
selection approach in sparse linear modeling.
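The estimator described in the abstract above can be written out as an optimization problem. The following is a sketch only: the symbols X (feature matrix), y (response), β (coefficients), δ (correlation budget), z (binary support indicators) and M (a bound on the coefficient magnitudes) are notation assumed here for illustration, and the big-M reformulation is a standard device for expressing a cardinality objective with linear constraints, not necessarily the exact formulation the authors solve.

```latex
% Discrete Dantzig Selector: fewest nonzeros subject to a correlation budget
\begin{aligned}
\min_{\beta \in \mathbb{R}^p} \quad & \|\beta\|_0 \\
\text{s.t.} \quad & \|X^\top (y - X\beta)\|_\infty \le \delta
\end{aligned}

% A big-M mixed integer linear sketch (M is an assumed bound on |\beta_i|)
\begin{aligned}
\min_{\beta,\, z} \quad & \sum_{i=1}^{p} z_i \\
\text{s.t.} \quad & -\delta \le x_j^\top (y - X\beta) \le \delta, && j = 1, \dots, p, \\
& -M z_i \le \beta_i \le M z_i, && i = 1, \dots, p, \\
& z_i \in \{0, 1\}, && i = 1, \dots, p.
\end{aligned}
```

Here x_j denotes the j-th column of X. Because both the objective and the constraints are linear in (β, z), the second problem is a mixed integer linear program, which is what allows a MILO solver to certify optimality.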
Improved variable selection with Forward-Lasso adaptive shrinkage
Recently, considerable interest has focused on variable selection methods in
regression situations where the number of predictors, p, is large relative to
the number of observations, n. Two commonly applied variable selection
approaches are the Lasso, which computes highly shrunk regression coefficients,
and Forward Selection, which uses no shrinkage. We propose a new approach,
"Forward-Lasso Adaptive SHrinkage" (FLASH), which includes the Lasso and
Forward Selection as special cases, and can be used in both the linear
regression and the Generalized Linear Model domains. As with the Lasso and
Forward Selection, FLASH iteratively adds one variable to the model in a
hierarchical fashion but, unlike these methods, at each step adjusts the level
of shrinkage so as to optimize the selection of the next variable. We first
present FLASH in the linear regression setting and show that it can be fitted
using a variant of the computationally efficient LARS algorithm. Then, we
extend FLASH to the GLM domain and demonstrate, through numerous simulations
and real world data sets, as well as some theoretical analysis, that FLASH
generally outperforms many competing approaches.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS375 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
The Discrete Dantzig Selector: Estimating Sparse Linear Models via Mixed Integer Linear Optimization
We propose a novel high-dimensional linear regression estimator: the Discrete Dantzig Selector, which minimizes the number of nonzero regression coefficients subject to a budget on the maximal absolute correlation between the features and residuals. Motivated by the significant advances in integer optimization over the past 10-15 years, we present a mixed integer linear optimization (MILO) approach to obtain certifiably optimal global solutions to this nonconvex optimization problem. The current state of algorithmics in integer optimization makes our proposal substantially more computationally attractive than the least squares subset selection framework based on integer quadratic optimization, recently proposed by Bertsimas et al., and the continuous nonconvex quadratic optimization framework of Liu et al. We propose new discrete first-order methods, which when paired with the state-of-the-art MILO solvers, lead to good solutions for the Discrete Dantzig Selector problem for a given computational budget. We illustrate that our integrated approach provides globally optimal solutions in significantly shorter computation times, when compared to off-the-shelf MILO solvers. We demonstrate both theoretically and empirically that in a wide range of regimes the statistical properties of the Discrete Dantzig Selector are superior to those of popular ℓ1-based approaches. We illustrate that our approach can handle problem instances with p = 10,000 features with certifiable optimality, making it a highly scalable combinatorial variable selection approach in sparse linear modeling.
COVID-19 second wave mortality in Europe and the United States
This paper introduces new methods to analyze the changing progression of
COVID-19 cases to deaths in different waves of the pandemic. First, an
algorithmic approach partitions each country or state's COVID-19 time series
into a first wave and subsequent period. Next, offsets between case and death
time series are learned for each country via a normalized inner product.
Combining these with additional calculations, we can determine which countries
have most substantially reduced the mortality rate of COVID-19. Finally, our
paper identifies similarities in the trajectories of cases and deaths for
European countries and U.S. states. Our analysis refines the popular conception
that the mortality rate has greatly decreased throughout Europe during its
second wave of COVID-19; instead, we demonstrate substantial heterogeneity
throughout Europe and the U.S. The Netherlands exhibited the largest reduction
of mortality, a factor of 16, followed by Denmark, France, Belgium, and other
Western European countries, greater than both Eastern European countries and
U.S. states. Some structural similarity is observed between Europe and the
United States, in which Northeastern states have been the most successful in
the country. Such analysis may help European countries learn from each other's
experiences and differing successes to develop the best policies to combat
COVID-19 as a collective unit.
Comment: Accepted manuscript. New appendix relative to v
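The offset-learning step mentioned in the abstract above (aligning case and death time series via a normalized inner product) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names `normalized_inner_product` and `best_offset`, the `max_offset` search window, and the choice to score each candidate lag by cosine-style similarity and keep the maximum are all hypothetical.

```python
import math


def normalized_inner_product(a, b):
    """Return <a, b> / (||a|| * ||b||), or 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_offset(cases, deaths, max_offset):
    """Find the lag at which deaths best align with earlier cases.

    For each candidate lag, cases[t] is compared with deaths[t + lag];
    the lag with the highest normalized inner product is returned
    together with its score. (Assumed selection rule, for illustration.)
    """
    if len(cases) != len(deaths):
        raise ValueError("cases and deaths must have the same length")
    best_lag, best_score = 0, -1.0
    for lag in range(max_offset + 1):
        n = len(cases) - lag
        if n <= 0:
            break
        score = normalized_inner_product(cases[:n], deaths[lag:])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score


if __name__ == "__main__":
    # Synthetic data: deaths are cases delayed by 3 steps and scaled by 0.1.
    cases = [0, 1, 4, 9, 4, 1, 0, 0, 0, 0]
    deaths = [0, 0, 0, 0, 0.1, 0.4, 0.9, 0.4, 0.1, 0]
    lag, score = best_offset(cases, deaths, max_offset=5)
    print(lag, round(score, 6))
```

Normalizing by the vector norms makes the score insensitive to the overall scale of either series, so the lag is chosen by shape alone; the ratio of deaths to offset cases would then be a separate calculation.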